Enhancing failure prediction in nuclear industry: Hybridization of knowledge- and data-driven techniques
Saley, Amaratou Mahamadou, Moyaux, Thierry, Sekhari, Aïcha, Cheutet, Vincent, Danielou, Jean-Baptiste
The convergence of the Internet of Things (IoT) and Industry 4.0 has significantly enhanced data-driven methodologies within the nuclear industry, notably improving safety and economic efficiency. This advancement raises the challenge of precisely predicting future maintenance needs for assets, which is crucial for reducing downtime and operational costs. However, the effectiveness of data-driven methodologies in the nuclear sector requires extensive domain knowledge due to the complexity of the systems involved. This paper therefore proposes a novel predictive maintenance methodology that combines data-driven techniques with domain knowledge of nuclear equipment. The methodological originality of this paper lies on two levels: highlighting the limitations of purely data-driven approaches and demonstrating the importance of knowledge in enhancing the performance of predictive models. The applicative novelty of this work lies in its use within a domain such as the nuclear industry, which is highly restricted and ultrasensitive due to security, economic, and environmental concerns. A detailed real-world case study, which compares the current state of equipment monitoring with two scenarios, demonstrates that the methodology significantly outperforms purely data-driven methods in failure prediction. While purely data-driven methods achieve only modest performance, with a prediction horizon limited to 3 h and an F1 score of 56.36%, the hybrid approach extends the prediction horizon to 24 h and achieves a higher F1 score of 93.12%.
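A minimal sketch of the hybrid idea, under the assumption that domain knowledge enters as expert-defined features (alarm thresholds, degradation rules) alongside raw sensor statistics fed to a standard classifier. All variable names, threshold values, and the synthetic data are illustrative, not the paper's actual monitoring pipeline:

```python
# Hybrid failure prediction sketch: knowledge-driven features + data-driven model.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "vibration_rms": rng.gamma(2.0, 1.0, n),
    "bearing_temp_c": rng.normal(60, 8, n),
})
# Hypothetical knowledge-driven features: an expert alarm threshold and a
# degradation rule a purely data-driven model would have to rediscover.
df["temp_alarm"] = (df["bearing_temp_c"] > 75).astype(int)
df["combined_stress"] = df["vibration_rms"] * np.maximum(df["bearing_temp_c"] - 60, 0)
y = (df["combined_stress"] + rng.normal(0, 2, n) > 8).astype(int)  # synthetic label

X_train, X_test, y_train, y_test = train_test_split(df, y, random_state=0)
model = GradientBoostingClassifier().fit(X_train, y_train)
print(f"F1 = {f1_score(y_test, model.predict(X_test)):.4f}")
```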
DART: A Structured Dataset of Regulatory Drug Documents in Italian for Clinical NLP
Barone, Mariano, Laudante, Antonio, Riccio, Giuseppe, Romano, Antonio, Postiglione, Marco, Moscato, Vincenzo
The extraction of pharmacological knowledge from regulatory documents has become a key focus in biomedical natural language processing, with applications ranging from adverse event monitoring to AI-assisted clinical decision support. However, research in this field has predominantly relied on English-language corpora such as DrugBank, leaving a significant gap in resources tailored to other healthcare systems. To address this limitation, we introduce DART (Drug Annotation from Regulatory Texts), the first structured corpus of Italian Summaries of Product Characteristics derived from the official repository of the Italian Medicines Agency (AIFA). The dataset was built through a reproducible pipeline encompassing web-scale document retrieval, semantic segmentation of regulatory sections, and clinical summarization using a few-shot-tuned large language model with low-temperature decoding. DART provides structured information on key pharmacological domains such as indications, adverse drug reactions, and drug-drug interactions. To validate its utility, we implemented an LLM-based drug interaction checker that leverages the dataset to infer clinically meaningful interactions. Experimental results show that instruction-tuned LLMs can accurately infer potential interactions and their clinical implications when grounded in the structured textual fields of DART. We publicly release our code on GitHub: https://github.com/PRAISELab-PicusLab/DART.
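The interaction checker can be pictured as prompt assembly over DART's structured fields. A minimal sketch, assuming a simplified record schema; the released repository defines the real fields and model:

```python
# Grounding an LLM-based interaction check in DART-style structured fields.
# The record schema and prompt wording are assumptions for illustration.
drug_a = {
    "name": "Warfarin",
    "interactions": "Increased bleeding risk with NSAIDs and some antibiotics.",
    "adverse_reactions": "Haemorrhage, bruising.",
}
drug_b = {
    "name": "Ibuprofen",
    "interactions": "May potentiate anticoagulants.",
    "adverse_reactions": "GI bleeding, dyspepsia.",
}

def build_prompt(a: dict, b: dict) -> str:
    """Assemble the structured textual fields into a grounded prompt."""
    return (
        "Using only the regulatory text below, state whether the two drugs "
        "interact and the clinical implication.\n\n"
        f"Drug 1 ({a['name']}): interactions: {a['interactions']} "
        f"adverse reactions: {a['adverse_reactions']}\n"
        f"Drug 2 ({b['name']}): interactions: {b['interactions']} "
        f"adverse reactions: {b['adverse_reactions']}"
    )

prompt = build_prompt(drug_a, drug_b)
# response = some_instruction_tuned_llm(prompt)  # model choice left open
print(prompt)
```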
CogGNN: Cognitive Graph Neural Networks in Generative Connectomics
Soussia, Mayssa, Lin, Yijun, Mahjoub, Mohamed Ali, Rekik, Islem
Generative learning has advanced network neuroscience, enabling tasks like graph super-resolution, temporal graph prediction, and multimodal brain graph fusion. However, current methods, mainly based on graph neural networks (GNNs), focus solely on structural and topological properties, neglecting cognitive traits. To address this, we introduce the first cognified generative model, CogGNN, which endows GNNs with cognitive capabilities (e.g., visual memory) to generate brain networks that preserve cognitive features. While the framework is broadly applicable, we present a specific variant designed to integrate visual input, a key factor in brain functions like pattern recognition and memory recall. As a proof of concept, we use our model to learn connectional brain templates (CBTs), population-level fingerprints from multi-view brain networks. Unlike prior work that overlooks cognitive properties, CogGNN generates CBTs that are both cognitively and structurally meaningful. Our contributions are: (i) a novel cognition-aware generative model with a visual-memory-based loss; (ii) a CBT-learning framework with a co-optimization strategy to yield well-centered, discriminative, cognitively enhanced templates. Extensive experiments show that CogGNN outperforms state-of-the-art methods, establishing a strong foundation for cognitively grounded brain network modeling.
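A minimal sketch of what a cognition-aware loss can look like: a structural centredness term plus a visual-memory term that rewards recalling a visual pattern from the generated template. The weighting, the linear readout, and the toy shapes are assumptions, not the paper's exact formulation:

```python
# Cognition-aware generative loss sketch: structural + visual-memory terms.
import torch

def cog_loss(cbt, brain_views, visual_pattern, decoder, lam=0.5):
    # Centredness: the template should be close to every subject's network.
    structural = torch.stack([(cbt - v).abs().mean() for v in brain_views]).mean()
    # Visual memory: a readout from the template should reconstruct the pattern.
    recalled = decoder(cbt.flatten())
    cognitive = torch.nn.functional.mse_loss(recalled, visual_pattern.flatten())
    return structural + lam * cognitive

# Toy shapes: 35-region networks, a 28x28 visual pattern, a linear readout.
views = [torch.rand(35, 35) for _ in range(4)]
cbt = torch.rand(35, 35, requires_grad=True)
decoder = torch.nn.Linear(35 * 35, 28 * 28)
loss = cog_loss(cbt, views, torch.rand(28, 28), decoder)
loss.backward()
print(float(loss))
```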
HydroVision: Predicting Optically Active Parameters in Surface Water Using Computer Vision
Deshmukh, Shubham Laxmikant, Wilchek, Matthew, Batarseh, Feras A.
Ongoing advancements in computer vision, particularly in pattern recognition and scene classification, have enabled new applications in environmental monitoring. Deep learning now offers non-contact methods for assessing water quality and detecting contamination, both critical for disaster response and public health protection. This work introduces HydroVision, a deep learning-based scene classification framework that estimates optically active water quality parameters including Chlorophyll-Alpha, Chlorophylls, Colored Dissolved Organic Matter (CDOM), Phycocyanins, Suspended Sediments, and Turbidity from standard Red-Green-Blue (RGB) images of surface water. HydroVision supports early detection of contamination trends and strengthens monitoring by regulatory agencies during external environmental stressors, industrial activities, and force majeure events. The model is trained on more than 500,000 seasonally varied images collected from the United States Geological Survey Hydrologic Imagery Visualization and Information System between 2022 and 2024. This approach leverages widely available RGB imagery as a scalable, cost-effective alternative to traditional multispectral and hyperspectral remote sensing. Four state-of-the-art convolutional neural networks (VGG-16, ResNet50, MobileNetV2, DenseNet121) and a Vision Transformer are evaluated through transfer learning to identify the best-performing architecture. DenseNet121 achieves the highest validation performance, with an R2 score of 0.89 in predicting CDOM, demonstrating the framework's promise for real-world water quality monitoring across diverse conditions. While the current model is optimized for well-lit imagery, future work will focus on improving robustness under low-light and obstructed scenarios to expand its operational utility.
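The transfer-learning setup can be sketched as a pretrained backbone with a small regression head, one output per optically active parameter. The freezing policy and layer sizes below are illustrative assumptions, not the paper's exact configuration:

```python
# Transfer-learning sketch: frozen DenseNet121 backbone + regression head.
import torch
import torch.nn as nn
from torchvision import models

NUM_PARAMS = 6  # chlorophyll-a, chlorophylls, CDOM, phycocyanins, sediments, turbidity

model = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
for p in model.features.parameters():
    p.requires_grad = False          # freeze the backbone, train only the head
model.classifier = nn.Sequential(
    nn.Linear(model.classifier.in_features, 256),
    nn.ReLU(),
    nn.Linear(256, NUM_PARAMS),      # one regression output per parameter
)

x = torch.rand(8, 3, 224, 224)       # a batch of RGB frames
print(model(x).shape)                # torch.Size([8, 6])
```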
Multi-Sensory Cognitive Computing for Learning Population-level Brain Connectivity
Soussia, Mayssa, Mahjoub, Mohamed Ali, Rekik, Islem
The generation of connectional brain templates (CBTs) has recently garnered significant attention for its potential to identify unique connectivity patterns shared across individuals. However, existing methods for CBT learning such as conventional machine learning and graph neural networks (GNNs) are hindered by several limitations. These include: (i) poor interpretability due to their black-box nature, (ii) high computational cost, and (iii) an exclusive focus on structure and topology, overlooking the cognitive capacity of the generated CBT. To address these challenges, we introduce mCOCO (multi-sensory COgnitive COmputing), a novel framework that leverages Reservoir Computing (RC) to learn population-level functional CBT from BOLD (Blood-Oxygen-level-Dependent) signals. RC's dynamic system properties allow for tracking state changes over time, enhancing interpretability and enabling the modeling of brain-like dynamics, as demonstrated in prior literature. By integrating multi-sensory inputs (e.g., text, audio, and visual data), mCOCO captures not only structure and topology but also how brain regions process information and adapt to cognitive tasks such as sensory processing, all in a computationally efficient manner. Our mCOCO framework consists of two phases: (1) mapping BOLD signals into the reservoir to derive individual functional connectomes, which are then aggregated into a group-level CBT - an approach, to the best of our knowledge, not previously explored in functional connectivity studies - and (2) incorporating multi-sensory inputs through a cognitive reservoir, endowing the CBT with cognitive traits. Extensive evaluations show that our mCOCO-based template significantly outperforms GNN-based CBT in terms of centeredness, discriminativeness, topological soundness, and multi-sensory memory retention. Our source code is available at https://github.com/basiralab/mCOCO.
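Phase (1) can be pictured with a standard echo state reservoir: each region's BOLD series drives a fixed random recurrent network, and correlations between the resulting reservoir traces give a functional connectome. This is one plausible reading under common reservoir-computing conventions, not the paper's exact settings:

```python
# Reservoir-based connectome sketch: BOLD signals -> reservoir traces -> correlations.
import numpy as np

rng = np.random.default_rng(0)
n_regions, n_time, n_res = 35, 200, 100

W_in = rng.normal(0, 1, (n_res, 1))
W = rng.normal(0, 1, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1

def reservoir_trace(signal):
    """Mean reservoir activation over time for one region's BOLD signal."""
    state, states = np.zeros(n_res), []
    for u in signal:
        state = np.tanh(W @ state + (W_in * u).ravel())
        states.append(state.copy())
    return np.mean(states, axis=0)

bold = rng.normal(0, 1, (n_regions, n_time))           # toy BOLD signals
traces = np.stack([reservoir_trace(s) for s in bold])  # (regions, reservoir)
connectome = np.corrcoef(traces)                       # (regions, regions)
print(connectome.shape)
```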
Agentic AI Frameworks: Architectures, Protocols, and Design Challenges
Derouiche, Hana, Brahmi, Zaki, Mazeni, Haithem
| Aspect | Traditional AI agents | Modern agentic AI systems (LLM-based agents) |
|---|---|---|
| Definition | Autonomous entities with fixed sensing/acting loops; limited by static rules or models | Autonomous reasoning systems using LLMs with dynamic behavior, tool orchestration, and context-awareness |
| Autonomy | Limited autonomy; often dependent on human input or predefined instructions | High autonomy; capable of independently performing complex and extended tasks |
| Goal Management | Focused on single, static goals or fixed task planning | Capable of managing multiple, evolving, and nested goals adaptively |
| Architecture | Rule-based or BDI (Belief-Desire-Intention) models; monolithic design | Modular architecture centered on LLMs, with components for memory, tools, context injection, and roles |
| Adaptability | Suited to controlled, predictable environments; poor generalization | Designed for open, dynamic, and unpredictable environments |
| Decision-Making | Deterministic or rule-based logic; symbolic reasoning | Context-sensitive, probabilistic reasoning with adaptive planning and self-reflection |
| Learning Mechanism | Rule-based or supervised learning with limited updates | Self-supervised and reinforcement learning; continual fine-tuning possible |
| Context Handling | Static or manually coded states and rules | Dynamic context injection via agent protocols (e.g., MCP, A2A) and runtime awareness |
| Communication | Message-passing via ACL or KQML | Real-time, event-driven collaboration; natural language interfaces |
| Tool Use | Limited or predefined tools and actions | Dynamic tool invocation, chaining, and API calling based on context |
| Memory | Optional, often hardcoded or task-specific | Integrated memory systems supporting long- and short-term information retention |
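The right-hand column of the table can be condensed into a minimal agent loop: a policy chooses tools, observes results, and accumulates short-term memory. The tool registry and the decide() stub below are illustrative assumptions standing in for a real LLM call:

```python
# Minimal agentic loop sketch: policy -> tool invocation -> memory update.
from typing import Callable

def search_docs(query: str) -> str:
    return f"(stub) top result for {query!r}"

def calculator(expr: str) -> str:
    return str(eval(expr, {"__builtins__": {}}))  # toy only; never eval untrusted input

TOOLS: dict[str, Callable[[str], str]] = {"search": search_docs, "calc": calculator}

def decide(goal: str, memory: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM policy: choose a tool and its argument."""
    return ("calc", "6*7") if not memory else ("done", "")

def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    memory: list[str] = []            # short-term memory of observations
    for _ in range(max_steps):
        tool, arg = decide(goal, memory)
        if tool == "done":
            break
        memory.append(f"{tool}({arg}) -> {TOOLS[tool](arg)}")
    return memory

print(run_agent("answer 6*7"))
```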
Leveraging Novel Ensemble Learning Techniques and Landsat Multispectral Data for Estimating Olive Yields in Tunisia
Kefi, Mohamed, Pham, Tien Dat, Nguyen, Thin, Tjoelker, Mark G., Devasirvatham, Viola, Kashiwagi, Kenichi
Olive production is an important tree crop in Mediterranean climates. However, olive yield varies significantly due to climate change. Accurately estimating yield using remote sensing and machine learning remains a complex challenge. In this study, we developed a streamlined pipeline for olive yield estimation in the Kairouan and Sousse governorates of Tunisia. We extracted features from multispectral reflectance bands, vegetation indices derived from Landsat-8 OLI and Landsat-9 OLI-2 satellite imagery, along with digital elevation model data. These spatial features were combined with ground-based field survey data to form a structured tabular dataset. We then developed an automated ensemble learning framework, implemented using AutoGluon to train and evaluate multiple machine learning models, select optimal combinations through stacking, and generate robust yield predictions using five-fold cross-validation. The results demonstrate strong predictive performance from both sensors, with Landsat-8 OLI achieving R2 = 0.8635 and RMSE = 1.17 tons per ha, and Landsat-9 OLI-2 achieving R2 = 0.8378 and RMSE = 1.32 tons per ha. This study highlights a scalable, cost-effective, and accurate method for olive yield estimation, with potential applicability across diverse agricultural regions globally.
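The ensemble step maps directly onto AutoGluon's tabular API. A minimal sketch with placeholder column and file names; the fold count mirrors the paper's five-fold setup, but the exact configuration is an assumption:

```python
# AutoGluon ensemble sketch over the tabular features (band reflectances,
# vegetation indices, elevation joined with field-survey yields).
import pandas as pd
from autogluon.tabular import TabularPredictor

train = pd.read_csv("olive_features_train.csv")   # hypothetical file
predictor = TabularPredictor(label="yield_t_ha", eval_metric="r2").fit(
    train,
    presets="best_quality",   # enables stacking/bagging of multiple models
    num_bag_folds=5,          # five-fold bagged cross-validation
)
test = pd.read_csv("olive_features_test.csv")     # hypothetical file
print(predictor.evaluate(test))
print(predictor.leaderboard(test))
```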
On the Origins of Sampling Bias: Implications on Fairness Measurement and Mitigation
Zhioua, Sami, Binkyte, Ruta, Ouni, Ayoub, Ktata, Farah Barika
Accurately measuring discrimination is crucial to faithfully assessing the fairness of trained machine learning (ML) models. Any bias in measuring discrimination leads to either amplification or underestimation of the existing disparity. Several sources of bias exist, and it is typically assumed that bias resulting from machine learning is borne equally by different groups (e.g., females vs. males, whites vs. blacks, etc.). If, however, bias is borne differently by different groups, it may exacerbate discrimination against specific sub-populations. The term sampling bias, in particular, is used inconsistently in the literature to describe bias due to the sampling procedure. In this paper, we attempt to disambiguate this term by introducing clearly defined variants of sampling bias, namely, sample size bias (SSB) and underrepresentation bias (URB). Through an extensive set of experiments on benchmark datasets and using mainstream learning algorithms, we expose relevant observations in several model training scenarios. These observations are finally framed as actionable recommendations for practitioners.
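The two variants are easy to reproduce on synthetic data: SSB shrinks the whole training sample, while URB shrinks only one group; the resulting disparity is then measured, here with statistical parity difference. The data-generating process below is an assumption for illustration:

```python
# Contrasting sample size bias (SSB) and underrepresentation bias (URB).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20000
g = rng.integers(0, 2, n)                       # protected attribute
x = rng.normal(g * 0.5, 1.0, (n, 1))            # feature mildly tied to group
y = (x.ravel() + rng.normal(0, 1, n) > 0.25).astype(int)
df = pd.DataFrame({"x": x.ravel(), "g": g, "y": y})

def spd(model, data):
    """Statistical parity difference between groups on model predictions."""
    pred = model.predict(data[["x"]])
    return pred[data.g == 1].mean() - pred[data.g == 0].mean()

ssb = df.sample(500, random_state=0)            # SSB: fewer rows overall
urb = pd.concat([df[df.g == 0],                 # URB: only group 1 is shrunk
                 df[df.g == 1].sample(250, random_state=0)])
for name, d in [("full", df), ("SSB", ssb), ("URB", urb)]:
    m = LogisticRegression().fit(d[["x"]], d.y)
    print(name, round(spd(m, df), 4))
```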
Advancements in Natural Language Processing for Automatic Text Summarization
Jayatilleke, Nevidu, Weerasinghe, Ruvan, Senanayake, Nipuna
The substantial growth of textual content across diverse domains and platforms has created a considerable need for Automatic Text Summarization (ATS) techniques that aid in the process of text analysis. The effectiveness of text summarization models has been significantly enhanced in a variety of technical domains thanks to advancements in Natural Language Processing (NLP) and Deep Learning (DL). Despite this, the process of summarizing textual information continues to be significantly constrained by the intricate writing styles of a variety of texts, which involve a range of technical complexities. Text summarization techniques can be broadly categorized into two main types: abstractive summarization and extractive summarization. Extractive summarization directly extracts sentences, phrases, or segments of text from the content without making any changes. Abstractive summarization, on the other hand, reconstructs the sentences, phrases, or segments of the original text using linguistic analysis. Through this study, a linguistically diverse categorization of text summarization approaches is addressed in a constructive manner. In this paper, the authors explore existing hybrid techniques that employ both extractive and abstractive methodologies. In addition, the pros and cons of various approaches discussed in the literature are investigated. Furthermore, the authors conduct a comparative analysis of different techniques and metrics for evaluating generated summaries using language generation models. This survey endeavors to provide a comprehensive overview of ATS by presenting the progression of language processing on this task through a breakdown of diverse systems and architectures, accompanied by technical and mathematical explanations of their operations.
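The extractive side of this dichotomy reduces to sentence scoring and selection. A toy sketch using TF-IDF salience (an illustrative scorer, not one of the surveyed systems):

```python
# Toy extractive summarizer: score sentences by TF-IDF weight, keep the top-k.
import re
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def extractive_summary(text: str, k: int = 2) -> str:
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if len(sentences) <= k:
        return text
    tfidf = TfidfVectorizer().fit_transform(sentences)
    scores = np.asarray(tfidf.sum(axis=1)).ravel()      # sentence salience
    keep = sorted(np.argsort(scores)[-k:])              # top-k, original order
    return " ".join(sentences[i] for i in keep)

doc = ("Transformers changed summarization. Early systems used frequency counts. "
       "Modern abstractive models rewrite content. Extractive models copy sentences.")
print(extractive_summary(doc))
```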
VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models
Kumar, Gokul Karthik, Chaabane, Iheb, Wu, Kebin
Vision-language models (VLMs) excel in various visual benchmarks but are often constrained by the lack of high-quality visual fine-tuning data. To address this challenge, we introduce VisCon-100K, a novel dataset derived from interleaved image-text web documents. Our approach transforms 45K web documents from the OBELICS dataset into 100K image conversation samples. We utilize GPT-4V to generate image-contextual captions and the OpenChat 3.5 model to convert these captions into diverse free-form and multiple-choice question-answer pairs. Integrating this dataset for fine-tuning considerably enhances VLM performance across multiple benchmarks. Unlike methods that focus solely on fine-grained visual content, our approach leverages the accompanying web context, yielding superior results. We also discover that a 'leaky modality mix,' where conversation samples contain questions answerable from both the image and its contextual caption, outperforms non-leaky combinations of captions and Q&A pairs. The VisCon-100K dataset shows strong performance with two popular VLM approaches: a text-only large language model (LLM) aligned with a vision encoder using image caption data (ShareGPT4V-7b) and a multimodally pretrained LLM (IDEFICS2-8b) using interleaved image-text data. In addition to releasing the VisCon-100K dataset, we provide a contextual captioner trained on this dataset, facilitating scalable fine-tuning data generation for future research and open-source applications. Using the same pipeline, but substituting our trained contextual captioner for GPT-4V, we also release the larger VisCon-1M dataset.
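A 'leaky' conversation sample can be pictured as mixing caption-answerable and image-only questions in one record. The field names below are assumptions; the released dataset defines the actual schema:

```python
# Assembling a 'leaky modality mix' sample: some questions are answerable from
# the contextual caption, some only from the image itself.
import random

def make_conversation(image_url: str, caption: str, qa_pairs: list[dict]) -> dict:
    """Mix caption-answerable and image-only questions into one sample."""
    random.shuffle(qa_pairs)
    return {
        "image": image_url,
        "context_caption": caption,    # kept so questions can 'leak' into it
        "conversation": [
            {"q": p["q"], "a": p["a"], "leaky": p["answerable_from_caption"]}
            for p in qa_pairs
        ],
    }

sample = make_conversation(
    "https://example.com/img.jpg",
    "A street market in Naples at dawn, stalls still being set up.",
    [
        {"q": "What city is shown?", "a": "Naples", "answerable_from_caption": True},
        {"q": "How many stalls are visible?", "a": "Six", "answerable_from_caption": False},
    ],
)
print(sample["conversation"][0])
```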